Implement Iterator for Bound<'py, PySequence> #3927

LilyFoote · 2024-03-02T23:44:36Z

When working on #3923 I noticed that we didn't have an implementation of Iterator for PySequence. I'm not 100% certain this is the right thing to add, since it hides the fallibility of len and get_item with .expect calls. But a properly implemented Python Sequence shouldn't raise in these, so maybe the ergonomics justify allowing panics here?

If we want this, I think now would be a good time to add it, since the changes to error handling it introduces would be straightforward as part of the general Bound migration.
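For illustration, here is a minimal standard-library sketch of the trade-off described above. It is not the PR's code: `FallibleSeq`, `PanickingIter`, `VecSeq`, `iter_panicking`, and `sum_all` are hypothetical names, with `FallibleSeq` standing in for a sequence whose `len` and `get_item` can fail. The iterator yields plain items and hides failures behind `.expect`.

```rust
/// Hypothetical stand-in for a Python sequence: both operations can fail,
/// in the same spirit as fallible `len` and `get_item` calls.
trait FallibleSeq {
    fn len(&self) -> Result<usize, String>;
    fn get_item(&self, index: usize) -> Result<i32, String>;
}

/// Infallible iterator: yields items directly and panics if the
/// underlying sequence reports an error.
struct PanickingIter<'a, S: FallibleSeq> {
    seq: &'a S,
    index: usize,
}

impl<'a, S: FallibleSeq> Iterator for PanickingIter<'a, S> {
    type Item = i32;

    fn next(&mut self) -> Option<i32> {
        // The length is re-read on every step because the sequence could
        // change size while it is being iterated.
        let len = self.seq.len().expect("failed to get sequence length");
        if self.index >= len {
            return None;
        }
        let item = self
            .seq
            .get_item(self.index)
            .expect("failed to get sequence item");
        self.index += 1;
        Some(item)
    }
}

/// Hypothetical constructor for the iterator.
fn iter_panicking<S: FallibleSeq>(seq: &S) -> PanickingIter<'_, S> {
    PanickingIter { seq, index: 0 }
}

/// A well-behaved sequence backed by a `Vec`.
struct VecSeq(Vec<i32>);

impl FallibleSeq for VecSeq {
    fn len(&self) -> Result<usize, String> {
        Ok(self.0.len())
    }

    fn get_item(&self, index: usize) -> Result<i32, String> {
        self.0
            .get(index)
            .copied()
            .ok_or_else(|| format!("index {} out of range", index))
    }
}

/// The ergonomic payoff: a plain `for` loop with no error handling.
fn sum_all<S: FallibleSeq>(seq: &S) -> i32 {
    let mut total = 0;
    for item in iter_panicking(seq) {
        total += item;
    }
    total
}
```

With a well-behaved sequence the loop is as simple as iterating a `Vec`; with a misbehaving one, the `.expect` calls turn the error into a panic, which is the concern raised here.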

davidhewitt · 2024-03-04T21:44:40Z

Thanks for proposing this!

I think it's definitely an open design question. There's some possibility already for the set and frozenset iterators to panic in the event of a poorly-implemented subclass. That said, I think poorly-implemented sequences seem more likely. Is that a problem? I'm not sure.

I'd be interested to know what was the original motivation. Was it completeness? Do you use this one downstream? Is this noticeably faster than going via Bound<PyIterator>?

While it's true that this could be a good time to add it, I'm also tempted to argue the opposite, and that (where possible) we should be aiming to keep the differences between the Bound and GIL Ref APIs as small as possible.

I think adding this in a future PyO3 would be breaking but might not be that hard to fixup in isolation, so I'm not super worried if this doesn't make it in 0.21.

LilyFoote · 2024-03-04T22:09:39Z

> I think poorly-implemented sequences seem more likely. Is that a problem? I'm not sure.

Yeah, me neither.

> I'd be interested to know what was the original motivation. Was it completeness?

Pretty much this - I was looking through the pyo3 codebase for any uses of .iter() in a for loop to see if they could be simplified.

> Do you use this one downstream?

No.

> Is this noticeably faster than going via Bound<PyIterator>?

I haven't tested this.

> While it's true that this could be a good time to add it, I'm also tempted to argue the opposite, and that (where possible) we should be aiming to keep the differences between the Bound and GIL Ref APIs as small as possible.

Fair.

> I think adding this in a future PyO3 would be breaking but might not be that hard to fixup in isolation, so I'm not super worried if this doesn't make it in 0.21.

Avoiding the need for a breaking change was the main motivation for suggesting it for 0.21.

davidhewitt · 2024-03-06T09:55:29Z

I've been thinking about this more. pydantic-core does iterate sequences in a few places, and we could upgrade very straightforwardly to use infallible iteration. But I'm still uneasy that infallible iteration is correct; panics tend to upset Pydantic users.

Maybe the right way forward here is to just make the iteration fallible? It would at least avoid the need for the .iter() call which you noticed. That would remove my main concern tbh.

A related question would be whether any other types deserve iterators too. PyString, PyBytes, PyMapping?
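As a rough sketch of the fallible alternative suggested above, the iterator can yield `Result` items so that errors reach the caller instead of panicking. This uses only the standard library; `FallibleSeq`, `FallibleIter`, `fallible_iter`, `VecSeq`, `FlakySeq`, and `sum_all` are hypothetical names and do not come from PyO3 or from this PR.

```rust
/// Hypothetical stand-in for a sequence whose operations can fail.
trait FallibleSeq {
    fn len(&self) -> Result<usize, String>;
    fn get_item(&self, index: usize) -> Result<i32, String>;
}

/// Fallible iterator: every step yields a `Result`.
struct FallibleIter<'a, S: FallibleSeq> {
    seq: &'a S,
    index: usize,
    failed: bool,
}

impl<'a, S: FallibleSeq> Iterator for FallibleIter<'a, S> {
    type Item = Result<i32, String>;

    fn next(&mut self) -> Option<Self::Item> {
        // Stop after the first error so that a consumer which ignores
        // errors cannot loop forever on a broken sequence.
        if self.failed {
            return None;
        }
        let len = match self.seq.len() {
            Ok(len) => len,
            Err(err) => {
                self.failed = true;
                return Some(Err(err));
            }
        };
        if self.index >= len {
            return None;
        }
        let item = self.seq.get_item(self.index);
        self.index += 1;
        if item.is_err() {
            self.failed = true;
        }
        Some(item)
    }
}

/// Hypothetical constructor for the iterator.
fn fallible_iter<S: FallibleSeq>(seq: &S) -> FallibleIter<'_, S> {
    FallibleIter { seq, index: 0, failed: false }
}

/// A well-behaved sequence backed by a `Vec`.
struct VecSeq(Vec<i32>);

impl FallibleSeq for VecSeq {
    fn len(&self) -> Result<usize, String> {
        Ok(self.0.len())
    }

    fn get_item(&self, index: usize) -> Result<i32, String> {
        self.0
            .get(index)
            .copied()
            .ok_or_else(|| format!("index {} out of range", index))
    }
}

/// A misbehaving sequence: it reports three items but fails on index 1.
struct FlakySeq;

impl FallibleSeq for FlakySeq {
    fn len(&self) -> Result<usize, String> {
        Ok(3)
    }

    fn get_item(&self, index: usize) -> Result<i32, String> {
        if index == 1 {
            Err("item 1 raised".to_string())
        } else {
            Ok(index as i32 * 10)
        }
    }
}

/// The `for` loop still works; each item is unwrapped with `?`.
fn sum_all<S: FallibleSeq>(seq: &S) -> Result<i32, String> {
    let mut total = 0;
    for item in fallible_iter(seq) {
        total += item?;
    }
    Ok(total)
}
```

The cost compared with the infallible version is one `?` per item; in exchange, a misbehaving sequence produces an `Err` rather than a panic.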


davidhewitt

Sorry for the long pause here; I'd missed that you pushed the fallible iteration.

In light of #4553, this might now have more space in the API as a clearly separate method from .try_iter().

That said, I'm a little uneasy about the move away from Python iteration here, and the subtleties of the breaking change. e.g. what happens if the sequence implements __iter__ and does something a bit... exotic?

Perhaps let's revisit in 0.24 when users should have migrated off .iter(), so it's less likely to be an in-place breaking change.
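To make the concern about an exotic `__iter__` concrete, here is a small standard-library model of how index-based iteration and iteration via the object's own iterator can disagree. Everything here is hypothetical: `SeqLike`, `Exotic`, `iter_protocol` (standing in for `__iter__`), `via_index`, and `via_iter_protocol` are illustrative names, and errors are left out for brevity.

```rust
/// Hypothetical model of an object that supports both indexing and its
/// own iteration protocol.
trait SeqLike {
    fn len(&self) -> usize;
    fn get_item(&self, index: usize) -> i32;
    /// Stand-in for a Python `__iter__` method.
    fn iter_protocol(&self) -> Box<dyn Iterator<Item = i32> + '_>;
}

/// A sequence whose own iterator yields items in reverse order.
struct Exotic(Vec<i32>);

impl SeqLike for Exotic {
    fn len(&self) -> usize {
        self.0.len()
    }

    fn get_item(&self, index: usize) -> i32 {
        self.0[index]
    }

    fn iter_protocol(&self) -> Box<dyn Iterator<Item = i32> + '_> {
        Box::new(self.0.iter().rev().copied())
    }
}

/// Iterate by walking indices `0..len` and calling `get_item`.
fn via_index<S: SeqLike>(seq: &S) -> Vec<i32> {
    (0..seq.len()).map(|i| seq.get_item(i)).collect()
}

/// Iterate by asking the object for its own iterator.
fn via_iter_protocol<S: SeqLike>(seq: &S) -> Vec<i32> {
    seq.iter_protocol().collect()
}
```

For `Exotic(vec![1, 2, 3])` the two functions return the items in opposite orders, so switching an existing call from one strategy to the other could change behavior without any compile-time signal.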

davidhewitt · 2024-10-04T11:58:51Z

src/types/sequence.rs

+    unsafe fn get_item(&self, index: usize) -> PyResult<Bound<'py, PyAny>> {
+        self.sequence.get_item(index)
+    }


Suggested change

-    unsafe fn get_item(&self, index: usize) -> PyResult<Bound<'py, PyAny>> {
-        self.sequence.get_item(index)
-    }
+    fn get_item(&self, index: usize) -> PyResult<Bound<'py, PyAny>> {
+        self.sequence.get_item(index)
+    }

LilyFoote added 2 commits March 22, 2024 23:09

Implement Iterator for Bound<'py, PySequence>

cc75742

This allows using a `Bound<'py, PySequence>` directly in a for loop.

Make BoundSequenceIterator iteration fallible

1e6fcf8

LilyFoote force-pushed the pysequence-iter branch from cfed22f to 1e6fcf8 · March 22, 2024 23:13

davidhewitt reviewed Oct 4, 2024

davidhewitt added this to the 0.24 milestone Oct 4, 2024
